Effective Reading Programs for English Language Learners 
and Other Language Minority Students 

Abstract 

This article systematically reviews research on elementary reading programs for 
English language learners and other language minority students. It focuses on studies 
that compared experimental and control groups on quantitative reading measures. 
Among beginning reading models, research supported structured, phonetic programs 
emphasizing language development, in both native-language and English instruction. 
Tutoring programs were also supported. For upper-elementary reading, research 
supported a broad range of programs, but particularly effective were programs using 
cooperative learning, extensive vocabulary instruction, and use of literature. 

For many years, the focus of policy debates relating to the reading education of 
English language learners (ELLs) has been on the question of language of instruction, 
contrasting bilingual and English-only approaches. As important as language of 
instruction is, however, there has been a growing recognition in recent years that quality 
of instruction is at least as important as language of instruction in the ultimate success of 
ELLs (see, for example, August & Hakuta, 1997; Brisk, 1998; Christian & Genesee, 
2001; Goldenberg, 1996; Secada et al., 1998). 

Research on language of instruction, reviewed most recently by Greene (1997) 
and Slavin & Cheung (in press), has generally found that bilingual programs are more 
effective than English-only programs. Slavin & Cheung (in press) found particularly 
strong evidence favoring paired bilingual programs, in which students are taught to read 
both in their native language and in English beginning in kindergarten or first grade, a 
strategy typically seen in two-way bilingual programs. However, in today’s political 
environment, the language of reading instruction is likely to be determined by factors 
beyond the control of individual educators. Whatever the language of instruction may be, 
educators concerned with ELLs need programs known to be effective with these students. 

Quality of instruction is the product of many factors, including the quality of 
teachers, class size, and other resources. One factor is the program of instruction used 
each day to teach reading. A number of coherent, replicable reading programs combining 
materials and professional development have been developed and used with English 
language learners. This article reviews research on reading programs for English 
language learners and other language minority students in an attempt to apply consistent, 
well-justified standards of evidence to draw conclusions about which of these programs 
are effective for these children. The review applies a technique called “best-evidence 
synthesis” (Slavin, 1986), which seeks to apply consistent, clear standards to identify 
unbiased, meaningful information from experimental studies and then discusses each 
qualifying study, computing effect sizes but also describing the context, design, and 
findings of each study. Best-evidence synthesis closely resembles meta-analysis, but it 
requires more extensive discussion of key studies. Details of this procedure are described 
later. The purpose of this review is to examine the quantitative evidence on replicable 
reading programs for English language learners and other language minority students to 
discover how much of a scientific basis there is for competing claims about effects of 
various programs. Our purpose is to inform practitioners, policymakers, and researchers 
about the current state of the evidence on this topic as well as gaps in the knowledge base 
in need of further scientific investigation. 
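
The effect sizes reported throughout this review can be illustrated with a short calculation. The sketch below is a minimal example, assuming the common definition of an effect size as the experimental-control difference in means divided by the control group's standard deviation; the excerpt does not restate the formula, so this definition, the function names, and all scores are hypothetical illustrations rather than the review's own procedure.

```python
from statistics import mean, median, stdev


def effect_size(experimental, control):
    """Standardized mean difference (assumed definition, not from the source):
    (experimental mean - control mean) / control-group standard deviation."""
    return (mean(experimental) - mean(control)) / stdev(control)


def median_effect_size(effect_sizes):
    """Summarize several effect sizes (e.g., one per measure) by their median."""
    return median(effect_sizes)


# Hypothetical posttest scores, invented purely for illustration.
experimental_scores = [12, 14, 16]
control_scores = [10, 12, 14]

es = effect_size(experimental_scores, control_scores)
print(f"ES = {es:+.2f}")
```

A positive value means the experimental group outscored the control group; summarizing several measures by their median, as the review does, limits the influence of any single extreme value.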

Review Methods and Criteria for Inclusion 

Review methods for studies of reading programs for English language learners 
and other language minority students were as follows. 

1. The studies involved elementary (K-6) children identified as ELL or language 
minority (e.g., “Hispanic”) in English-speaking countries. 

2. The studies compared children taught in classes using a given reading program to 
those in control classes using standard textbooks. 

3. The language of instruction was the same in experimental and control groups. 

4. Random assignment or matching with appropriate adjustments for any pretest 
differences had to be used. Studies without control groups, such as pre-post 
comparisons and comparisons to “expected” gains, were excluded, as were studies 
with pretest differences of more than one standard deviation. 

5. The dependent measures included quantitative measures of reading performance, 
such as standardized reading measures. In all cases, measures included 
assessments of comprehension, not just phonics or decoding. The focus on 
quantitative measures was intended to allow for comparable, objective 
conclusions about program effects across studies. 

6. A minimum treatment duration of 12 weeks was required. 
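
The six criteria above can be read as a screening checklist applied to each candidate study. The following is a minimal sketch of such a screen; the `Study` record, its field names, and the example values are hypothetical, and the check simplifies criterion 4 by treating any matched design as having appropriate pretest adjustments.

```python
from dataclasses import dataclass, replace


@dataclass
class Study:
    """Hypothetical record of the study features the criteria refer to."""
    lowest_grade: int               # 0 = kindergarten
    highest_grade: int
    ell_or_language_minority: bool
    english_speaking_country: bool
    has_control_group: bool
    same_language_both_groups: bool
    assignment: str                 # "random", "matched", or "none"
    pretest_gap_sd: float           # absolute pretest difference, in SD units
    measures_comprehension: bool
    duration_weeks: int


def meets_inclusion_criteria(s: Study) -> bool:
    """Return True only if every criterion in the list is satisfied."""
    return (
        s.lowest_grade >= 0 and s.highest_grade <= 6       # criterion 1: K-6
        and s.ell_or_language_minority
        and s.english_speaking_country
        and s.has_control_group                            # criterion 2
        and s.same_language_both_groups                    # criterion 3
        and s.assignment in ("random", "matched")          # criterion 4
        and s.pretest_gap_sd <= 1.0
        and s.measures_comprehension                       # criterion 5
        and s.duration_weeks >= 12                         # criterion 6
    )


# Invented examples: one study that qualifies and one that is too brief.
qualifying = Study(0, 1, True, True, True, True, "matched", 0.2, True, 36)
too_brief = replace(qualifying, duration_weeks=10)

print(meets_inclusion_criteria(qualifying), meets_inclusion_criteria(too_brief))
```

Writing the screen as a single conjunction mirrors the review's logic: a study failing any one criterion is excluded, regardless of how well it meets the others.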

Studies of Beginning Reading Programs 
It is in the earliest years of formal education that children define themselves as 
learners, largely on the basis of reading success. The early elementary years are of 
particular importance for English language learners, as this is the time when they are 
most likely to be struggling both to learn a new language and to learn to read. Perhaps 
because of this, the largest number of methodologically adequate studies have focused on 
the early elementary grades. Studies in this section are ones in which the treatments 
begin in kindergarten or first grade. 

There were 13 studies of beginning reading that met the criteria outlined above. 
Most studies of reading approaches for English language learners and other language 
minority students lacked control groups or objective measures, did not document or 
control for pretest differences, or were very brief. The main characteristics and findings 
of the qualifying studies are summarized in Table 1. 

TABLE 1 HERE 



Success for All 

Among the beginning reading studies that met the inclusion criteria, six evaluated 
the Success for All program (Slavin & Madden, 1999, 2001). Success for All is a 
comprehensive reform model that provides schools with well-structured curriculum 
materials emphasizing systematic phonics in grades K-1, and cooperative learning, direct 
instruction in comprehension skills, and other elements in grades 2-6. It also provides 
extensive professional development and follow-up for teachers, frequent assessment and 
regrouping, one-to-one tutoring for children who are struggling in reading, and family 
support programs. A full-time facilitator helps all teachers implement the model. 

For English language learners, Success for All has two variations. One is a 
Spanish bilingual program, Exito Para Todos, which teaches reading in Spanish in 
grades 1-2 and then transitions students to English-only instruction, usually starting in third 
grade. The other is an English language development (ELD) adaptation, which teaches 
children in English with appropriate supports, such as vocabulary development strategies 
linked to the words introduced in children’s reading texts. 

Studies of Success for All with English language learners and other language 
minority students have compared children taught using the Spanish adaptation to other 
children taught in Spanish, or have compared the ELD adaptation to other ELD English 
reading programs. 

Success for All: Spanish Bilingual Adaptation (Exito Para Todos) 

California (Bilingual). Researchers at the Southwest Educational Research 
Laboratory (now part of WestEd) conducted a three-year longitudinal study involving 
three California elementary schools and three matched controls. They pooled data across 
the schools in four categories: English-dominant students, Spanish-dominant students 
taught in Spanish, Spanish-dominant students taught in English, and speakers of 
languages other than English or Spanish taught in English. Three cohorts were followed. 
Data for a 1992 cohort were reported for grades 1, 2, and 3; for 1993, grades 1 and 2; and 
for 1994, grade 1 only. 

Students in the two Exito Para Todos schools in California scored higher on the 
Spanish Woodcock than controls at every grade level in all three cohorts (Livingston & 
Flaherty, 1997; Dianda & Flaherty, 1995). Median effect sizes across cohorts averaged 
+0.97 for first graders, +0.44 for second graders, and +0.03 for third graders. The 
analyses for second and third graders understate the magnitude of the differences. In line 
with district and program policies, students initially taught in Spanish were transitioned 
into English instruction as soon as they demonstrated an ability to succeed in English. 
Because of their success in Spanish reading, many more Exito Para Todos than control 
students were transitioned during second and third grades. Therefore, the highest- 
achieving experimental students were being removed from the Spanish sample, reducing 
the mean for this group. (This is a common problem in studies of transitional bilingual 
education.) 
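
The reasoning in this paragraph can be made concrete with a small worked example. The sketch below is illustrative only: the scores, the exit threshold, and the function name are hypothetical and are not data from the California study. It shows how removing the highest scorers from both groups narrows the observed gap when more experimental students clear the threshold.

```python
from statistics import mean


def mean_of_remaining(scores, exit_threshold):
    """Mean score of students still taught in Spanish, i.e., those who
    scored below the (hypothetical) threshold for transition to English."""
    remaining = [s for s in scores if s < exit_threshold]
    return mean(remaining)


# Invented scores: the experimental group is 10 points ahead overall.
experimental = [40, 50, 60, 70, 80, 90]
control = [30, 40, 50, 60, 70, 80]
threshold = 75  # hypothetical score at which students are transitioned

full_gap = mean(experimental) - mean(control)
remaining_gap = (mean_of_remaining(experimental, threshold)
                 - mean_of_remaining(control, threshold))

print(full_gap, remaining_gap)
```

In this example two experimental students but only one control student are transitioned, so the gap among the remaining students is half the gap in the full groups, even though the program's effect on every student is unchanged.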

Houston (Bilingual). The largest study of Exito Para Todos took place in the 
Houston Independent School District (HISD). Both Spanish and English forms of 
Success for All were studied (see Nunnery, Slavin, Madden, Ross, Smith, Hunter, & 
Stubbs, 1997). 

The bilingual portion of the study compared first graders in 20 schools 
implementing Exito Para Todos to those in 10 matched schools also using Spanish 
bilingual instruction. Children were assessed on three scales from the Spanish 
Woodcock: Word Identification, Word Attack, and Passage Comprehension. Ten 
children were selected at random from each school; after missing data were removed, 
there were 298 Spanish-dominant students across the 30 schools with bilingual programs. 

School-level comparisons showed significant differences (p < .05) between SFA 
schools and comparison schools on Word Identification and Word Attack. Overall, the 
median student-level effect size in comparison to controls was +0.22. 

Success for All: English Language Development Adaptation 

Philadelphia (ELD). The first evaluation of the English language development 
(ELD) adaptation of Success for All took place in a Philadelphia school (Slavin & 
Madden, 1995). Sixty-two percent of the students at this school, Key, were from Asian backgrounds, 
primarily Cambodian. Nearly all of them entered the school in kindergarten with little or 
no English. The remainder of the school was divided between African American and 
White students. 

The program was evaluated in comparison to a matched Philadelphia elementary 
school. The two schools were very similar in overall achievement level and other 
variables. All students in grades 4-5, most of whom had been in their respective 
programs since kindergarten, were individually administered three scales from the 
Woodcock Language Proficiency Battery (Woodcock, 1984): Word Identification, Word 
Attack, and Passage Comprehension. Asian Success for All students at both grade levels 
performed substantially better than Asian control students. The differences were 
statistically significant on every measure at both grade levels (p<.001). 

California (ELD). The three-year California study (Livingston & Flaherty, 1997; 
Dianda & Flaherty, 1995) included data on English language learners taught in English. 
These included students in one Modesto school that did not have a bilingual 
program, as well as ELLs in the two schools (one in Modesto and one in Riverside) who 
were speakers of languages other than English or Spanish. 

Results for Spanish-dominant students taught in English showed strong impacts 
for first graders (ES = +1.36), smaller ones for second graders (ES = +0.46), and no 
differences for third graders (ES = -0.09). Again, the transitioning of successful students 
out of ESL classes reduced the apparent differences by third grade (because the highest 
achieving students were no longer receiving ESL services). 

Results for speakers of languages other than English or Spanish (taught in 
English) were similar to those for Spanish-dominant ESL students. Averaging across 
cohorts, effect sizes were +0.24 for first graders, +0.37 for second graders, and +0.05 for 
third graders (Livingston & Flaherty, 1997; Dianda & Flaherty, 1995). 

Arizona (ELD). Another study of the ELD adaptation of Success for All in 
schools serving many students acquiring English took place in an Arizona school district 
(Ross, Smith, & Nunnery, 1998). This one-year study compared first graders in two 
Success for All schools to those in four schools using locally developed Title I 
schoolwide projects. Students were pretested on the English Peabody Picture Vocabulary 
Test (PPVT) and then posttested on the Woodcock Word Identification, Word Attack, 
and Passage Comprehension scales, and the Durrell Oral Reading Test. Analyses of 
covariance found that Hispanic Success for All students scored significantly higher than 
control students on all measures (ES=+0.52). 

Texas Statewide Evaluation. Hurley, Chamberlain, Slavin, & Madden (2001) 
reported an analysis of data from the Texas Assessment of Academic Skills (TAAS), 
comparing reading gains (from the year schools began to implement Success for All to 
1998) by all 111 Success for All schools in the state to those made by students throughout 
Texas. The comparisons involving Hispanic students are relevant to this review. Note 
that while the TAAS data were for grades 3-5, most of the students had been in the 
program three to four years, meaning that they had begun in grades K-2. 

Ninety-five of the Success for All schools had enough Hispanic students in grades 
3-5 to be included in the analysis. Analyzing at the school level, their TAAS reading 
gains were significantly greater (p<.007) than those for Hispanic students in the state as a 
whole. Means for Hispanic students in the SFA schools and state means for Hispanic 
students were similar in the year before SFA was introduced. The effect size for school means was 
+0.28. 

Success for All with Embedded Video. Chambers, Slavin, Madden, Cheung, & 
Gifford (2004a) carried out a study of an adaptation of Success for All that incorporated 
embedded video. Four types of video material were used: animations to present letter 
sounds, puppet vignettes to present sound blending, live-action skits to present 
vocabulary, and a variety of segments from the television program Between the Lions to 
reinforce various skills. The brief video segments were interspersed in teachers’ lessons 
in grades K-1. Hispanic students were expected to benefit in particular from the SFA and 
embedded video treatment because the videos included vocabulary presentations and 
clear visual reinforcements of reading skills. Hispanic students in four schools in 
different parts of the U.S. were compared to matched students in similar schools that did 
not use Success for All or embedded video. A year-long study involving 311 
experimental and 144 control students in grades K-1 found that, controlling for Peabody 
Picture Vocabulary Test (PPVT) scores, students in schools using Success for All with embedded video 
scored significantly higher than controls on Woodcock Word Identification (ES= +0.40), 
Word Attack (ES= +0.36), and Passage Comprehension (ES=+0.21). 

Success for All: Conclusions 

The effects of Success for All on the achievement of English language learners 
and other language minority students are not entirely consistent, but in general they are 
substantially positive. Across two studies of Exito Para Todos, the Spanish bilingual 
adaptation of Success for All, the median effect size on Spanish assessments was +0.41. 
Across five studies of the ELD adaptation of Success for All, the median effect size was 
+0.37. 

Embedded Video 

A recent study compared Success for All schools using the embedded video 
materials described above to schools also implementing Success for All but without the 
embedded videos (Chambers, Madden, Cheung, Gifford, & Slavin, 2004b). Because all 
schools used SFA, this was not a study of Success for All, but of the added embedded 
video treatment. Ten majority-Hispanic schools in inner-city Hartford, CT were 
randomly assigned to SFA + embedded video or SFA-only (control) conditions for a 
one-year experiment. Analyses for Hispanic children, who made up 66% of the sample, 
found positive effects on Woodcock Word Identification (ES=+0.23) and Word Attack 
(ES=+0.36), controlling for the PPVT and the Woodcock Word Identification scale. 

Direct Instruction 

Direct Instruction (DI), or Distar (Adams & Engelmann, 1996), is a reading 
program that starts in kindergarten with very specific instructions to teachers on how to 
teach beginning reading skills. It uses reading materials with a phonetically controlled 
vocabulary, rapidly paced instruction, regular assessment, and systematic approaches to 
language development. Like Success for All, DI provides extensive professional 
development and coaching to all teachers. DI was not specifically written for English 
language learners or Latino students, but it is often used with them. 

The most important evaluation of DI was the Follow Through study of the 1970s, 
in which nine early literacy programs were evaluated (Stebbins, St. Pierre, Proper, 
Anderson, & Cerva, 1977). In sites throughout the U.S., matched experimental and 
control schools were compared on various measures of reading. 

One of the sites was in Uvalde, Texas, which primarily served Hispanic students. 
Becker & Gersten (1982) carried out a followup of the Follow Through study when the 
children who had experienced the treatments in grades K-3 were in grades 5-6. They 
found that the Uvalde DI students, who were well matched on demographic factors with 
their control group, scored substantially better than the controls. Effect sizes averaged 
+0.47 for two scales of the individually administered WRAT and +0.16 across three 
Metropolitan Achievement Test (MAT) subscales, for a median across five tests in two 
grades of ES = +0.21. 

Gersten (1985) evaluated DI as part of a structured immersion program for limited 
English proficient students who spoke a variety of Asian languages. In addition to the DI 
beginning reading program, the structured immersion model emphasized English at a 
level understood by the students, occasional translation, preteaching of vocabulary, and 
direct teaching of the structure of the English language. Students in a matched control 
group participated in programs whose characteristics were not described, but which also 
primarily taught in English. 

Across two cohorts, 75% of DI students scored at or above grade level on the 
CTBS Total Reading Scale at the end of two years, while only 19% of comparison 
students were at or above grade level (p<.001). 

Jolly Phonics (Systematic Phonics Instruction) 

Stuart (1999) carried out an evaluation of Jolly Phonics, an English phonetic 
kindergarten reading program, in five London primary schools. This program was 
compared to a big books program emphasizing teaching by drawing children’s attention 
to letters and words in popular children’s stories. The subjects were mostly English 
language learners, and among these most were speakers of Sylheti (a language of 
Bangladesh). Most subjects were 5-year-olds. One teacher in each school volunteered to 
implement either Jolly Phonics (JP) or Big Books (BB). The JP and BB schools were 
well matched on most variables including free meals and academic performance, but the 
JP schools had many more children at beginning ESL levels (53% vs. 30%). 

The interventions took place one hour per day for 12 weeks. The results strongly 
favored the JP group. Effect sizes for five gain-score measures of phonemic awareness 
and phonics knowledge had a median value of +0.70 at posttest and +0.16 on a delayed 
posttest one year later. On five measures of reading and writing, the median effect size 
for gain scores was +1.06 at the end of the experiment and +0.52 one year later. 

Reading Recovery/Descubriendo la Lectura 

Reading Recovery is an early intervention tutoring program for young readers 
who are experiencing difficulty in their first year of reading instruction (Clay, 1993). The 
program provides the lowest achieving readers (lowest 20%) in first grade with 
supplemental tutoring in addition to their regular reading classes. Children participating 
in Reading Recovery receive daily one-to-one 30-minute lessons for 12-20 weeks with a 
certified, specially trained teacher. The lessons include assessment, reading known 
stories, reading a story that was read once the day before, writing a story, working with a 
cut-up sentence, and reading a new book. Descubriendo la Lectura (DLL), the Spanish 
adaptation of Reading Recovery, is equivalent in all major aspects to the original 
program. There have been many evaluations comparing Reading Recovery and control 
students, including a large-scale, randomized evaluation in Ohio (Pinnell, Lyons, DeFord, 
Bryk, & Seltzer, 1994). Only one study involving English language learners met the 
inclusion standards of this review. This was a 7-month evaluation of DLL conducted by 
Escamilla (1994) in Tucson. The experiment compared 23 DLL students to 23 matched 
comparison students also taught in Spanish in another school. In both cases, students 
were identified as being in the lowest 20% of their classes based on individually 
administered tests and teacher judgment. The two groups were well matched on the 
Spanish Aprenda. The outcomes of DLL on Spanish reading measures at the end of first 
grade were very positive. On six scales of a Spanish Observation Survey adapted from 
the measures used in evaluations of the English Reading Recovery program, DLL 
students started out below controls and ended the year substantially ahead of them, with a 
median effect size of +0.84. 

Small Group Tutorials with Direct Instruction 

Gunn, Biglan, Smolkowski, & Ary (2000) evaluated a small group tutorial 
program that used two forms of DI, Reading Mastery and Corrective Reading, as a 
supplementary intervention for Hispanic and non-Hispanic children who were struggling 
in reading. The children were in kindergarten to third grade, and were selected either 
because they scored at a very low level on an achievement measure or because they were 
rated by their teachers as being high in aggressive behavior (and were below grade level 
in reading). Children were selected from nine rural Oregon elementary schools. They 
were randomly assigned to experimental or control conditions. Those children assigned 
to the experimental group were taught in homogeneous groups of one to three children 
using Reading Mastery if they were in grades K-2, or Corrective Reading if they were in 
grades 3-4. They were taught daily by instructional assistants for two years. Only 19 of 
the 122 Hispanic students were considered non-English speaking; the oral English skills 
of the remaining students were not specified. 

The experimental and control groups were very well matched on the Woodcock- 
Johnson Letter Word Identification and Word Attack scales, and on Oral Reading 
Fluency. After the first year, tutorial students who had received five to six months of 
supplementary instruction showed greater gains than control students on all three 
measures, Letter-Word ID (ES=+0.22), Word Attack (ES=+0.70), and Fluency 
(ES=+0.16). Only the Word Attack differences were significant. At the end of the 
second year, after 15-16 months of instruction, effect sizes for gains from pretest on these 
measures were +0.46, +0.91, and +0.43, respectively. In addition, there were positive 
effects on Woodcock Reading Vocabulary (ES=+0.44) and Passage Comprehension 
(ES=+0.48), given as posttests only. Experimental-control differences on all five 
measures were significant after two years. 

Libros 

Goldenberg (1990) studied a school and home reading intervention for Spanish- 
dominant kindergartners. The intervention, called Libros, involved teachers introducing 
and extensively discussing a Spanish story and then sending home photocopied “books” 
with children once every three weeks through kindergarten. Parents were encouraged to 
read with their children and were shown a videotape of a parent reading and discussing 
the story. In control classrooms, teachers sent home worksheets on letters and syllables. 
Children in four classrooms using Libros were matched with those in four control 
classrooms based on Bilingual Syntax Measure scores. On an experimenter-constructed 
set of 13 Spanish early literacy assessments at the end of the year, experimental children 
scored significantly higher than controls (median ES = +0.51). Effects were strongest on 
measures of letter and word identification, but were less positive on comprehension 
measures. 



Studies of Upper Elementary Reading Programs 
Several studies have evaluated reading programs for English language learners in 
grades 2-5. Seven of these met the inclusion criteria. These are summarized in Table 2 
and described in the following sections. 



TABLE 2 HERE 



Bilingual Cooperative Integrated Reading and Composition (BCIRC) 

An experiment by Calderon, Hertz-Lazarowitz, & Slavin (1998) evaluated a 
cooperative learning program called Bilingual Cooperative Integrated Reading and 
Composition, or BCIRC. BCIRC is an adaptation of Cooperative Integrated Reading and 
Composition, an upper elementary reading program based on principles of cooperative 
learning that has been successfully evaluated in several studies (see Stevens, Madden, 
Slavin, & Farnish, 1987). BCIRC was adapted to meet the needs of limited English 
proficient children in bilingual programs who are transitioning from Spanish to English 
reading. In CIRC and BCIRC, students work in four-member heterogeneous teams. 
After a teacher introduction, students engage in a set of activities related to a story they 
are reading. These include partner reading in pairs, and team activities focused on 
vocabulary, story grammar, summarization, reading comprehension, creative writing, and 
language arts. BCIRC adds to these activities transitional readers (in this study, 
Macmillan’s Campanitas de Oro and Transitional Reading Program), and ESL 
strategies, such as total physical response, realia, and appropriate use of cognates, to help 
children transfer skills from Spanish to English reading. 

Control teachers also used the same Campanitas de Oro and Transitional Reading 
Program textbooks, and received training in generic cooperative learning strategies. 
None of the control teachers used cooperative learning consistently, although all of them 
made occasional use of these strategies. 

The BCIRC study involved 222 Hispanic children in the Ysleta Independent 
School District in El Paso, Texas. Seven of the highest-poverty schools in the district 
were assigned to experimental (3 schools) or control (4 schools) conditions. As a whole, 
the experimental and control schools were well matched demographically. Two cohorts 
were assessed, one of which was involved for just one year (second grade) and the other 
for two years (grades 2-3). Analyses of covariance controlling for Bilingual Syntax 
Measure scores found significantly higher scores for students in BCIRC classes in both 
cohorts. 

Bilingual Transition with Success for All 

An experiment by Calderon, August, Slavin, Duran, Madden, & Cheung (2004) 
evaluated an enriched transition program for children who had been taught in Spanish 
using Success for All and were moving to the English program in third grade. The 
enriched program, a descendant of BCIRC, included an English phonics program called 
FastTrack Phonics, rapidly presented components of the Success for All beginning 
reading (Reading Roots) program including the embedded videos described earlier, and 
explicit instruction in vocabulary using strategies similar to those used by Carlo et al. 
(2004). The experiment compared students in El Paso, TX, who received the full 
program to matched students in similar control schools. After one year, students in the 
program scored higher than control students (controlling for Spanish and English 
Woodcock Scales) on Woodcock Word Attack (ES=+0.21), Passage Comprehension 
(ES=+0.16), and Picture Vocabulary (ES=+0.11). Experimental students scored higher 
on some of the Spanish measures as well. 

Enriched Transition 

Saunders & Goldenberg (1996) evaluated a program designed to help English 
language learners transition from Spanish to English. The treatment focused on literature 
study, writing, discourse, skill building, reading comprehension strategies, independent 
reading, teacher read-alouds, and other elements. These treatments were applied to 
second and fifth graders in transitional bilingual education (TBE) and English-only 
classes. In each case, a control group was matched with the experimental group. Over a 
year, the English-only experimental group scored higher than control groups on an 
English reading measure in second grade (ES=+0.34), but not in fifth grade (ES=+0.03). 
Second grade TBE students, tested in Spanish, scored substantially better in the 
experimental condition (ES=+1.36). Fifth-grade experimental TBE students tested in 
English also showed substantially higher achievement (ES=+0.68). 

The Saunders & Goldenberg (1996) article only reported on the first year of a 
three-year transition project. A study of the full program was described by Saunders 
(1998). It compared children in the three-year transition program (using the methods 
described above) to those in a three- to six-month transition, the usual treatment for ELLs 
in the district studied. On Spanish measures, differences were not significant in grade 1 
(ES= -0.02) and grade 2 (ES=+0.26), but significant in grade 3 (ES=+0.38) and grade 4 
(ES=+0.59). In a Cantonese-dominant subgroup, experimental students scored 
significantly higher on English tests (grade 4, ES=+0.53; grade 5, ES=+0.80). At fifth 
grade, an early-transitioning group was tested in English and a late-transitioning group 
was tested in Spanish. In both cases, effects favored the experimental group (ES=+0.50 
for English, ES=+0.92 for Spanish). Similar effects were seen on performance measures 
of reading and writing, and experimental students passed a test used as a criterion for 
placement in English-only instruction at much higher rates than did controls. 

Vocabulary Intervention 

Carlo, August, McLaughlin, Snow, Dressler, Lippman, Lively, & White (2004) 
carried out a two-year evaluation of a vocabulary teaching intervention with Spanish- 
dominant fourth and fifth graders in California, Massachusetts, and Virginia. The 
intervention involved introducing 12 vocabulary words each week using a variety of 
strategies, such as charades, 20 questions, discussions of Spanish cognates, word webs, 
and word association games. 
The experimental students were taught in one five-week unit and two six-week 
units in the first year, and three five-week units in the second year. Matched control 
students continued their usual instruction. Experimental and control students were not 
significantly different at pretest on any of an extensive set of measures. 

At the end of the first year, ELLs showed greater gains from pretest than controls, 
but surprisingly, gains were lower after two years of intervention. 

Perez (1981) evaluated an oral language intervention with Spanish-dominant third 
graders in Texas. The intervention consisted of daily 20-minute sessions in which 
children worked with humorous language games, pictures, and other activities intended to 
build their oral language. The experimental group of 75 students was compared to a 
well-matched control group. On an unspecified reading measure, the experimental group 
scored substantially higher. 

Tutoring 

Two types of one-to-one tutoring for English language learners were studied in an 
experiment by Denton, Anthony, Parker, & Hasbrouck (2004). Spanish-dominant 
students in grades 2-5 in a bilingual program in Texas were assigned to one of two 
separate experiments. Those scoring lower than the first-grade level on the Woodcock 
Word Attack scale were randomly assigned to a program called “Read Well” (Sprick, 
Howard, & Fidanque, 1998), or to an untutored control group. Those scoring higher than 
this were randomly assigned to a tutoring program called “Read Naturally” or to an 
untutored control group. Read Well uses systematic phonics instruction and practice in 
fully decodable text (like the first-grade instruction in Success for All). Read Naturally 
(Ihnot, 1992) emphasizes repeated readings of connected text, vocabulary, and 
comprehension instruction. Tutors were undergraduate education majors. All tutoring 
was done in English. The final sample of students in the Read Well evaluation included 
19 experimental and 14 control children. Those in the experimental group received an 
average of 22 tutoring sessions. In the Read Naturally comparison, there were 32 tutored 
and 28 non-tutored children. 

The results indicated substantially higher achievement for the Read Well students 
than for controls, with a median effect size of +0.51 across six measures. Differences 
were statistically significant only on the Woodcock Word Attack scale (p<.016) and an 
oral reading accuracy scale (p<.001). In contrast, there were no differences between the 
children tutored with Read Naturally and those who were not tutored (ES= +0.08). 
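
The median effect sizes cited above can be checked against the six posttest effect sizes listed for each tutoring program in Table 2. The following is a minimal sketch in Python, not part of the original analysis; the variable and function names are illustrative assumptions, and the values are copied from the Denton et al. (2004) rows of Table 2.

```python
from statistics import median

# Effect sizes as listed for Denton et al. (2004) in Table 2 of this review.
# The dictionary names are illustrative, not taken from the original study.
read_well = {
    "Word Identification": 0.55,
    "Word Attack": 0.46,
    "Passage Comprehension": 0.00,
    "Fluency": 0.18,
    "Accuracy": 0.78,
    "Comprehension": 0.82,
}
read_naturally = {
    "Word Identification": -0.05,
    "Word Attack": -0.13,
    "Passage Comprehension": 0.16,
    "Fluency": 0.23,
    "Accuracy": 0.30,
    "Comprehension": 0.00,
}


def median_effect_size(effect_sizes):
    """Return the median of a collection of effect sizes."""
    return median(effect_sizes)


for program, scores in (("Read Well", read_well), ("Read Naturally", read_naturally)):
    print(f"{program}: median ES = {median_effect_size(scores.values()):+.3f}")
```

With an even number of measures, the median is the mean of the two middle values: (0.46 + 0.55) / 2 = 0.505 for Read Well and (0.00 + 0.16) / 2 = 0.08 for Read Naturally, consistent with the +0.51 and +0.08 reported in the text.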

Conclusions: Studies of Reading 

The research summarized in this article shows how much remains to be done on 
effective reading programs for English language learners and other language minority 
students. Only a handful of studies met the minimal inclusion standards applied in this 
review, which principally required an experimental-control comparison of a reading 
program over at least 12 weeks, with evidence that the two groups were equivalent at 
pretest. 

Beginning Reading. Among the 13 studies of interventions beginning in 
kindergarten or first grade that met the inclusion standards, the evidence supported 
structured, phonetic programs emphasizing language development, in both native- 
language and English instruction. The largest number of studies involved Success for All, 
a comprehensive reform model (Slavin & Madden, 1999). Two studies of Success for 
All in its Spanish bilingual form found consistent positive effects on students’ Spanish 
reading performance, with a median effect size of +0.41 (in comparison to schools 
teaching in Spanish using alternative methods). Similarly, five studies of schools using 
the English language adaptation of Success for All with Latino and Asian English 
language learners found positive effects, with a median effect size of +0.37. 

Studies evaluating Success for All with embedded video materials found positive 
effects of the combined program for Hispanic students (Chambers et al., 2004a) and 
found that the embedded videos added significantly to the effects (Chambers et al., 
2004b). 

Two longitudinal studies found strong and lasting effects of Direct Instruction 
(DI) on the reading achievement of language minority students. One was a follow-up of 
mostly Hispanic fifth and sixth graders in Texas who had experienced DI in grades K-3 
(Becker & Gersten, 1982). The other was a two-year study of DI in a structured 
immersion program for Asian English language learners (Gersten, 1985). An adaptation 
of DI for use in small-group tutorials (1-3 children) also showed positive effects (Gunn et 
al., 2000). 

No other beginning reading program had more than a single methodologically 
adequate study. A study of a systematic phonics program called Jolly Phonics (Stuart, 
1999) found promising effects among children of Bangladeshi origin in London, but the 
study had serious problems with pretest differences. Very positive effects were 
documented in a study of a Spanish adaptation of Reading Recovery (Escamilla, 1994). A 
study of Libros, a home and school literature approach using Spanish reading materials, 
documented benefits for ELL kindergartners (Goldenberg, 1990). 

Upper Elementary Reading. Seven studies of reading in grades 2-5 met the 
inclusion criteria. The evidence generally supported programs that make extensive use of 
cooperative learning, vocabulary instruction, and literature. A two-year evaluation of 
Bilingual Cooperative Integrated Reading and Composition (BCIRC; Calderon et al., 
1998), a cooperative learning strategy, found strong positive effects on the Spanish and 
English reading of children transitioning from Spanish to English reading in grades 2-3. 

A similar treatment, an enriched Spanish-to-English transition program based on Success 
for All, also showed significantly positive effects on English reading performance 
(Calderon et al., 2004). Saunders (1998) and Saunders & Goldenberg (1999) 
found positive effects of an enriched transition process for ELLs moving to English-only 
instruction. Carlo et al. (2004) found positive effects of an English vocabulary 
intervention for ELL fourth and fifth graders on various experimenter-made measures of 
vocabulary skill, and Perez (1981) found that instruction in oral English skills improved 
the reading skills of ELL third graders. Denton et al. (2004) evaluated two tutoring 
approaches and found that Read Well, a phonetic program, improved the English reading 
of very low achieving ELLs. 

It is important to note that the programs with the strongest evidence of 
effectiveness in this review are all programs that have also been found to be effective 
with students in general: Success for All (Slavin & Madden, 2000, 2001), Direct 
Instruction (Adams & Engelmann, 1996), Reading Recovery (Pinnell et al., 1994), and 
phonetic tutoring (e.g., Wasik & Slavin, 1993). In fact, several of the studies evaluating 
Success for All (e.g., Nunnery et al., 1997; Livingston & Flaherty, 1997; Ross et al., 
1998) as well as DI (Gunn et al., 2000), also included non-ELL students, and in each case 
those students also gained from the interventions, to about the same degree. The 
beginning reading programs with the strongest evidence of effectiveness in this review 
made use of systematic phonics, such as Success for All, Direct Instruction, and Jolly 
Phonics, but systematic phonics has been identified as a component of effective 
beginning reading programs for English proficient students as well (see National Reading 
Panel, 2000; Gersten & Geva, 2003). Typically, programs originally designed for use 
with English proficient students are considerably adapted for use with ELLs, with more 
emphasis on vocabulary and oral language (see Fitzgerald, 1995; Slavin & Calderon, 
2001). 

While we do have a good start on research in several areas, there is much more to 
be done. Large-scale, randomized, longitudinal evaluations of well-justified approaches 
are needed to more confidently recommend effective strategies for English language 
learners and other language minority students of all ages and backgrounds. Research 
systematically varying program components and research combining quantitative and 
qualitative methods are needed to more fully understand how various interventions affect 
the development of reading skills among English language learners. It is time to end the 
ideological debates, and to instead focus on good science, good practice, and sensible 
policies for children whose success in school means so much to themselves, their 
families, and our nation’s future. 
TABLE 1 

Beginning Reading Programs: Descriptive Information and Effect Sizes for Qualifying Studies 

Columns: Study; Intervention Description; Design; Duration; N; Grade; Sample Characteristics; Evidence of Initial Equality; Posttest; Effect Size; Median ES 

Success For All 


Nunnery et al. (1997) 
  Intervention Description: SFA-Bilingual 
  Design: Matched control 
  Duration: 1 yr 
  N: 298 in 30 schools 
  Grade: 1 
  Sample Characteristics: Spanish-dominant students across 30 schools with bilingual programs in Houston, TX 
  Evidence of Initial Equality: Fairly well matched on demographics and well matched on pretest. C>E; ES=-0.08 
  Posttest and Effect Size (Span. Woodcock): 
    Word Identification +0.24 
    Word Attack +0.26 
    Passage Comprehension +0.20 
  Median ES: +0.22 

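Livingston & Flaherty (1997) 
  Design: Matched control 
  Duration: 3 yrs 
  N: 6 schools (3 E & 3 C) 
  Grade: 1-3 
  Intervention Description: SFA-Bilingual 
    Sample Characteristics: Spanish-dominant bilingual students in CA 
    Evidence of Initial Equality: Well matched on demographics and PPVT pretests. 
    Posttest: Span. Woodcock 
    Median ES: Grade 1 +0.97; Grade 2 +0.44; Grade 3 +0.03 
  Intervention Description: SFA-English Language Development Adaptation 
    Sample Characteristics: Spanish-dominant ESL students in CA 
    Evidence of Initial Equality: Well matched on demographics and PPVT pretests. 
    Posttest: Eng. Woodcock 
    Median ES: Grade 1 +1.36; Grade 2 +0.46; Grade 3 -0.09 
  Intervention Description: SFA-English Language Development Adaptation 
    Sample Characteristics: Other ESL students in CA 
    Evidence of Initial Equality: Well matched on demographics and PPVT pretests. 
    Posttest: Eng. Woodcock 
    Median ES: Grade 1 +0.24; Grade 2 +0.37; Grade 3 +0.05 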

Slavin & Madden (1995); Slavin & Yampolsky (1991); Slavin, Leighton & Yampolsky (1990) 
  Intervention Description: SFA-English Language Development Adaptation 
  Design: Matched control 
  Duration: 5 yrs 
  N: 50 in 2 schools 
  Grade: K 
  Sample Characteristics: Asian students in Philadelphia 
  Evidence of Initial Equality: Well matched on overall achievement level, poverty, and other variables 
  Posttest and Effect Size (Eng. Woodcock, Grade 4): 
    Word Identification +1.54 
    Word Attack +1.49 
    Passage Comprehension +0.62 
  Median ES (Grade 4): +1.49 
  Posttest and Effect Size (Eng. Woodcock, Grade 5): 
    Word Identification +1.40 
    Word Attack +1.33 
    Passage Comprehension +0.75 
  Median ES (Grade 5): +1.33 

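Ross et al. (1998) 
  Intervention Description: SFA-English Language Development Adaptation 
  Design: Matched control 
  Duration: 1 yr 
  N: 540 in 6 schools 
  Grade: 1 
  Sample Characteristics: Tucson, Arizona: 39% Hispanic, 67% free lunch 
  Evidence of Initial Equality: Well matched on demographics and pretests 
  Posttest and Effect Size: 
    Eng. Woodcock Word Identification +0.51 
    Eng. Woodcock Word Attack +0.83 
    Eng. Woodcock Passage Comprehension +0.41 
    Durrell +0.32 
  Median ES: +0.52 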

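Hurley et al. (2001) 
  Intervention Description: SFA 
  Design: Compared gains to the state mean for Hispanic students 
  Duration: 4 yrs 
  N: 95 SFA schools 
  Grade: (K-2) -> (3-5) 
  Sample Characteristics: Hispanic students in TX 
  Evidence of Initial Equality: Well matched on initial TAAS reading scores 
  Posttest and Effect Size: English TAAS Reading (Grade 3-5) +0.28* 
  Median ES: +0.28* (ES from school means, not individual scores) 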

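Chambers et al. (2004a) 
  Intervention Description: SFA with embedded video 
  Design: Matched control 
  Duration: 1 yr 
  N: 455 in 8 schools 
  Grade: K-1 
  Sample Characteristics: Hispanic students in New York City, Washington, DC, rural AZ, southern CA 
  Evidence of Initial Equality: Well matched on PPVT 
  Posttest and Effect Size (Eng. Woodcock): 
    Word Identification +0.40 
    Word Attack +0.36 
    Passage Comprehension +0.21 
  Median ES: +0.36 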


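Other Programs 

Chambers et al. (2004b) 
  Intervention Description: Embedded video (SFA with embedded video vs. SFA) 
  Design: Random assignment of schools 
  Duration: 1 yr 
  N: 172 in 10 schools 
  Grade: 1 
  Sample Characteristics: Hispanic students in Hartford, CT 
  Evidence of Initial Equality: Well matched on PPVT, Word ID 
  Posttest and Effect Size: 
    Eng. Woodcock Word Identification +0.23 
    Eng. Woodcock Word Attack +0.36 
    Eng. Woodcock Passage Comprehension +0.16 
    DIBELS Fluency +0.07 
  Median ES: +0.20 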

Becker & Gersten (1982) 
  Intervention Description: Direct Instruction 
  Design: Matched control 
  Duration: Follow-up study, 2 yrs after the treatment 
  N: 225 
  Grade: K-3 
  Sample Characteristics: Hispanic ELL students in Uvalde, TX 
  Evidence of Initial Equality: Well matched on demographics 
  Posttest and Effect Size: 
    English WRAT Reading (across 2 grades): Level I +0.50; Level II +0.44; Mean +0.47 
    English MAT: Word Knowledge +0.11; Reading +0.21; Total Reading +0.16; Mean +0.16 
  Median ES: +0.21 

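Gersten (1985) 
  Intervention Description: Direct Instruction 
  Design: Matched control 
  Duration: 8 mos 
  N: ~35 
  Grade: 1-2 
  Sample Characteristics: Asian ELL students 
  Evidence of Initial Equality: Similar on LAS scores for cohort 1 (C>E) and cohort 2 (C>E) 
  Posttest and Effect Size: 
    English CTBS Reading: Experimental 75%; Control 19%; E>C 
    English CTBS Language: Experimental 71%; Control 44%; E>C 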

Stuart (1999) 
  Intervention Description: Phonetic program (Jolly Phonics) vs Literature-based program (Big Books) 
  Design: Matched control 
  Duration: 12 wks 
  N: 112 
  Grade: K 
  Sample Characteristics: Sylheti-dominant students in London 
  Evidence of Initial Equality: Well matched on demographics but not on pretests; JP>BB, ES=+0.88 on phonics knowledge pretests; JP>BB, ES=+0.70 on reading and writing pretests 
  Posttest and Effect Size (Eng. Woodcock): 
    Phoneme awareness (5 measures): +0.70; delayed tests (1 year later) +0.16 
    Reading and Spelling (5 measures): +1.06; delayed tests (1 year later) +0.52 
  Median ES: Immediate tests +0.88; Delayed tests +0.34 

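Escamilla (1994) 
  Intervention Description: Reading Recovery in Spanish (DLL) 
  Design: Matched control 
  Duration: 7 mos 
  N: 46 
  Grade: 1 
  Sample Characteristics: Spanish-dominant bilingual students in Arizona 
  Evidence of Initial Equality: Well matched on Spanish Aprenda, but on Spanish observation survey, C>E, median ES=-0.43 across four measures 
  Posttest and Effect Size (Span. Woodcock): 
    Spanish Aprenda +0.30 (Median ES +0.30) 
    Spanish Observation Survey (4 measures) +0.84 (Median ES +0.84) 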


Gunn et al. (2000) 
  Intervention Description: Small group tutoring using Direct Instruction 
  Design: Random assignment 
  Duration: 2 yrs 
  N: 122 
  Grade: K-4 
  Sample Characteristics: Low-achieving Hispanic students in rural Oregon 
  Evidence of Initial Equality: Well-matched on English Woodcock-Johnson, oral reading fluency 
  Posttest and Effect Size (Eng. Woodcock): 
    Year 1: Letter Word +0.22; Word Attack +0.70; Oral Reading Fluency [illegible] 
    Year 2: Letter Word +0.46; Word Attack +0.91; Oral Reading +0.43; Vocabulary +0.44; Comprehension +0.43 
  Median ES: Year 1 +0.22; Year 2 +0.44 

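Goldenberg (1990) 
  Intervention Description: Use of teacher-created booklets at home and at school 
  Design: Quasi-Experimental 
  Duration: 8 mos 
  N: 56 
  Grade: K 
  Sample Characteristics: Spanish-dominant students in Southern CA 
  Evidence of Initial Equality: Similar on Bilingual Syntax Measure and free lunch 
  Posttest and Effect Size (Spanish): 13 measures of early literacy development +0.83 
  Median ES: +0.83 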

TABLE 2 

Upper Elementary Reading Programs: Descriptive Information and Effect Sizes for Qualifying Studies 

Columns: Study; Intervention Description; Design; Duration; N; Grade; Sample Characteristics; Evidence of Initial Equality; Posttest; Effect Size; Median ES 

Calderon et al. (1998) 
  Intervention Description: Bilingual Cooperative Integrated Reading & Composition (BCIRC) 
  Design: Matched control 
  Duration: 2 yrs 
  N: 222 
  Grade: 2-3 
  Sample Characteristics: Spanish-dominant students in El Paso, TX 
  Evidence of Initial Equality: Well matched on demographics and pretests 
  Posttest and Effect Size: 
    Spanish TAAS, Grade 2: Reading +0.30; Writing +0.62 
    English TAAS, Grade 3: Reading +0.54; Writing +0.29 
    English TAAS, 2 yrs: Reading +0.87; Language +0.38 
    English TAAS, 1 yr: Reading +0.33; Language +0.22 
  Median ES: Spanish Reading +0.30; Spanish Writing +0.62; English Reading +0.54; English Writing/Language +0.29 

Calderon et al. (2004) 
  Intervention Description: Success for All with enriched transition 
  Design: Matched control 
  Duration: 1 yr 
  N: 239 in 8 schools 
  Sample Characteristics: Spanish-dominant students in El Paso, TX 
  Evidence of Initial Equality: Well matched on English and Spanish Woodcock 
  Posttest and Effect Size (English Woodcock): 
    Picture Vocabulary +0.11 
    Passage Comprehension +0.16 
    Word Attack +0.21 
  Median ES: +0.16 

Saunders & Goldenberg (1996) 
  Intervention Description: Enriched transition 
  Design: Matched control 
  Duration: 1 yr 
  N: 140 
  Grade: 2 & 5 
  Sample Characteristics: Spanish-dominant students in Southern CA 
  Evidence of Initial Equality: Well matched on pretests 
  Posttest and Effect Size: 
    English only group: 2nd grade-English Reading +0.34; 5th grade-English Reading +0.03 
    TBE group: 2nd grade-Spanish Reading +1.36; 5th grade-English Reading +0.68 
  Median ES: English Reading +0.19 (English only group); Spanish Reading +1.36 and English Reading +0.68 (TBE group) 

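Saunders & Goldenberg (1999) 
  Intervention Description: Enriched transition 
  Design: Matched control 
  Duration: 3 yrs 
  N: 102 
  Grade: 1-5 
  Sample Characteristics: Spanish and Cantonese speaking students in Southern CA 
  Evidence of Initial Equality: Well matched on % of LEP, SES, ethnicity, and achievement scores 
  Posttest and Effect Size: 
    Spanish subgroup, Spanish measures 
      Reading: 1st grade -0.02; 2nd grade +0.26; 3rd grade +0.38; 4th grade +0.59 
      Language, listed in the same grade order: +0.11; +0.20; +0.27; +0.38 
    Cantonese subgroup, English measures 
      Reading: 4th grade +0.53; 5th grade +0.80 
      Language, listed in the same grade order: +1.77; +0.78 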


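Carlo et al. (2004) 
  Intervention Description: Direct instruction in key vocabulary 
  Design: Matched control 
  Duration: 2 yrs 
  N: ~130 
  Grade: 4 & 5 
  Sample Characteristics: ELL students in CA, VA, and MA 
  Evidence of Initial Equality: Well matched on pretests 
  Posttest and Effect Size: 
    Eng Vocab Assessment: PPVT -0.08; Polysemy prod +0.33; Morphology +0.22; Semantic Association +0.21 
    Eng Reading Comp +0.17 
  Median ES: +0.21 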

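Perez (1981) 
  Intervention Description: Oral language activity 
  Design: Matched control 
  Duration: 3 mos 
  N: 150 
  Grade: 3 
  Sample Characteristics: Mexican American ELL students in TX 
  Evidence of Initial Equality: Well matched on demographics and pretests, E>C, ES=+0.15 
  Posttest and Effect Size: English +0.97 
  Median ES: +0.97 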

Denton et al. (2004) 
  Design: Random assignment 
  Duration: 4 mos 
  Grade: 2-5 
  Sample Characteristics: Spanish-dominant bilingual students in Texas 
  Intervention Description: Read Well (Tutoring using systematic phonics) 
    N: 33 
    Evidence of Initial Equality: Well matched on WRMT pretests; E>C, ES=+0.32 (0.3<p<0.6) 
    Posttest and Effect Size (English-Read Well): 
      Word Identification +0.55 
      Word Attack +0.46 
      Passage Comprehension +0.00 
      Fluency +0.18 
      Accuracy +0.78 
      Comprehension +0.82 
    Median ES: +0.51 
  Intervention Description: Read Naturally (Tutoring using repeated readings) 
    N: 60 
    Evidence of Initial Equality: Well matched on WRMT pretests 
    Posttest and Effect Size (English-Read Naturally): 
      Word Identification -0.05 
      Word Attack -0.13 
      Passage Comprehension +0.16 
      Fluency +0.23 
      Accuracy +0.30 
      Comprehension +0.00 
    Median ES: +0.08 